AN ANALYSIS OF THEMATIC MAPPER SIMULATOR DATA
COLLECTED OVER EASTERN NORTH DAKOTA 


I. SUMMARY


This report presents results of the analysis of aircraft-acquired 
Thematic Mapper Simulator (TMS) data, collected in August 1980 as part of 
the AgRISTARS Domestic Crops and Land Cover (DCLC) Project. The investi- 
gations presented in this document were carried out under the Sensor 
Implementation and Evaluation Research element of the project. The over- 
riding thrust of the research reported herein was to investigate the 
utility of Thematic Mapper (TM) data, through simulation, in crop area 
and land cover estimates. 

Results of the analysis indicate that the seven-channel TMS data are 
capable of delineating the 13 crop types included in the study to an over- 
all pixel classification accuracy of 80.97% correct, with relative effi- 
ciencies for four crop types examined between 1.62 and 26.61. 

Both supervised and unsupervised spectral signature development tech- 
niques as developed at NASA/NSTL/ERL were evaluated. The unsupervised 
methods proved to be inferior (based on analysis of variance) for the 
majority of crop types considered. Given the ground truth data set used 
for spectral signature development as well as evaluation of performance, 
it is possible to demonstrate which signature development technique 
would produce the highest percent correct classification for each crop 
type. 

II. INTRODUCTION 

The purpose of this report is to present results obtained from the 
analysis of TMS digital data collected in August 1980 over the Walsh
County area in the drainage basin of the Red River Valley in eastern North
Dakota (ND site). The data collected represent only a portion of the data 
which will be included in the total DCLC Project. Subsequent reports will 
deal with the analysis of TMS data collected over other study sites for 
additional crop and land cover types.

The work conducted in this investigation falls under the Sensor 
Implementation and Evaluation Research element of the project. The specific 
area of research is contained in Task 4.7.1, Thematic Mapper Procedure 
Development. The overall objectives of the task are (1) to provide an 
evaluation of the anticipated utility of the TM for crop and land cover 
estimates, and (2) to provide software/procedure development for the analysis 
of the TM data. 

This portion of the DCLC Project is conducted as a cooperative research 
effort between the National Aeronautics and Space Administration (NASA),
Earth Resources Laboratory (ERL), located at the National Space Technology
Laboratories (NSTL), and the United States Department of Agriculture,
Statistical Reporting Service (SRS). The ERL collected and analyzed the TMS
data, while SRS supplied the registered segment (ground truth) data, upon 
which performance evaluations were made. 

III. THEMATIC MAPPER SIMULATOR DATA 

Data used in this study were obtained by an airborne TMS scanner system 
(see Appendix A for TMS specifications). The TMS was designed to produce data 
with spectral and spatial characteristics (Figure 1) similar to those of the 
TM scanner, which will be on board Landsat D (scheduled for launch in late FY82). 
The TM will have a spatial resolution of 30 m (100 ft) in channels 1, 2, 3,
4, 5, and 7, and of 120 m (about 394 ft) in channel 6.

Figure 1. Spectral Wavelength Characteristics of TMS and Landsat MSS Systems

Figure 1 also presents the spectral resolution of currently available Landsat
MSS data, as well as a generalized green leaf reflectance curve (after Knipling, 
ref. 1) for comparison of the two sensor systems. While the TMS channels are 
numbered in order of their occurrence in the electromagnetic spectrum from 
short (blue) to long (IR) wavelengths, the channels of the TM do not follow 
this system. The reader should be aware of this channel numbering difference 
while reading this report. (TMS channel 6 is equivalent to TM channel 7, and
TMS channel 7 to TM channel 6; all others are identical.)

TMS data were collected on August 11, 1980, from an altitude of 12,000 m 
(39,370 ft) above mean terrain elevation. With a 2.5-milliradian aperture, 
this resulted in a spatial resolution (at nadir) of 30 x 30 m (100 x 100 ft) 
for channels 1 through 6, and 120 x 120 m (394 x 394 ft) for channel 7. The 
TMS scanned through a 50-degree angle on either side of nadir, but data 
processing and analysis were restricted to 30 degrees on either side of nadir 
to provide a closer simulation of TM data. Data collected by the TMS were
converted by the scanner to an 8-bit (256 levels of gray) digital
format for use in data processing and analysis activities. The data were
viewed on an image display device, and examined for radiometric fidelity and 
the presence of abnormal data values (detector noise, dropouts, loss of sync, 
etc.). Since the spatial resolution of channel 7 is four times as coarse 
as the other six channels, it contains only one-sixteenth the number of pixels. 
This situation was rectified by expanding the data for channel 7 by repeating 
each "pixel" in channel 7 three times in both the scan line and element 
directions. This resulted in blocks of 16 pixels (four by four pixels in size) 
each containing the radiometric value of the initial channel 7 pixel. In this 
manner, a channel-to-channel registration with the other six channels was 
performed, while at the same time the geometric relationship of channel 7 to the 
other six channels (four-to-one) was preserved. 
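
The channel 7 expansion described above can be sketched in a few lines of Python. This is a minimal illustration only, not the ERL software: the function names and the small 2 x 2 test image are assumptions made for the example. The first helper also reproduces the nadir spot size arithmetic (altitude multiplied by the angular aperture).

```python
def nadir_spot_size_m(altitude_m, ifov_rad):
    """Ground spot size at nadir, using the small-angle approximation."""
    return altitude_m * ifov_rad


def expand_channel(coarse, factor=4):
    """Replicate each coarse pixel into a factor x factor block of pixels."""
    expanded = []
    for row in coarse:
        # Repeat each element `factor` times along the scan line ...
        wide_row = [value for value in row for _ in range(factor)]
        # ... then repeat the widened scan line `factor` times.
        for _ in range(factor):
            expanded.append(list(wide_row))
    return expanded


# Hypothetical 2 x 2 channel 7 image (values are made up for illustration).
channel7 = [[10, 20],
            [30, 40]]
registered = expand_channel(channel7)

print(nadir_spot_size_m(12_000, 0.0025))    # spot size in metres
print(len(registered), len(registered[0]))  # scan lines, elements per line
```

Each coarse pixel becomes a 4 x 4 block, so a channel 7 image holding one-sixteenth the number of pixels ends up with the same line and element dimensions as the other six channels.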

When all problems had been corrected, the center 60 degrees of the data
were examined for sun angle/angle-of-look related trends. No such problem 
existed with this data set, as the aircraft data collection flight occurred 
"into the sun" and was within one half hour of solar noon. 

IV. GROUND TRUTH 

Field enumeration in SRS segments served as the ground-truth data source 
(ref. 2). An SRS segment is a parcel of land (at the subcounty level) delineated 
by natural or recognizable boundaries which is used to make statistical estimates 
about agricultural commodities. Segments are chosen by random selection procedures 
from an area frame stratified by general land uses. For the ND site, regular 
SRS segments were approximately 1 sq. mi. in area, containing from three to
eight land cover types. The segment data represent a random sampling of the
major crop cover types of interest within the ND site. Numerous fields within 
segments were used for each crop cover type of interest to ensure the statistical 
reliability of results obtained. 

Each segment was visited in the field during the 1980 June enumerative 
survey (ref. 2) by trained field personnel, who recorded the boundaries of 
each field and the land cover/land use. Several additional "mini-segments" 
were visited on 10 August 1980. Such sites were established to provide 
additional fields for training purposes for matching with remotely sensed 
data. Segment and field boundaries were drawn onto small-scale vertical 
photography. Ground truth information corresponding to these segments was 
placed into a ground truth book and filed for later comparison with the TMS 
data analysis results. 

For the five flight lines of TMS data collected, a total of 15 segments 
(including mini-segments) were used in the study, representing approximately 
2,064 ha (5,100 acres) total. This network of ground truth included fields 
representing wasteland (non-cropped), sunflowers, spring wheat, sugar beets,
"other" crops, alfalfa, barley, potatoes, corn, dry beans, flax, durum wheat,
and summer fallow. 

V. DATA PROCESSING AND ANALYSIS 

The initial phase of data processing dealt with registering the segment 
data to the seven channels of TMS data already located on a data file. This 
was done by geographically registering the TMS data to a map, and subsequently 
overlaying the geographically registered segment data to the map-registered 
TMS data. TMS data-to-map registration was accomplished using ERL software 
developed for that purpose (ref. 3). The registered data were subsequently sent
to SRS for the segment-to-TMS data registration.

After SRS had completed the segment-to-TMS overlay, the data were sent back 
to the ERL for analysis. Using a color image display device and aerial photo- 
graphy, each segment was examined. It was determined that the land cover in 
several fields had been modified since the fields had been visited on the 
ground. The most prominent change had resulted from harvesting operations 
which removed the crop cover and left stubble, etc. Thus, numerous fields 
were edited and renamed from the specific crop types to a generic "fall fallow" 
crop type. All flax fields had been harvested by August 11. Thus, flax does 
not appear in the remainder of this report. However, all other crop types 
were represented by fields in the edited segment file. 

One other significant problem was encountered with the segment file. It 
was found that several segments were not registered to the TMS data very well, 
presumably due to the instability of the aircraft platform at the time of 
data collection. Thus, when the data were registered to a map, certain areas 
did not fit well. This problem was corrected by simply re-establishing the 
segment locations in the TMS data and editing the file sent by SRS. 

The combined effect of harvesting and poor registration of segments can
be seen in Figure 2, which presents "before editing" and "after editing"
distributions for the TMS data corresponding to the barley crop type. As can
be seen, both the multimodal and large variance tendencies for all channels of
data are dramatically changed by editing, producing a much more uniform
spectral distribution for barley. Similar results were noticed for other
crop types affected by the same problems. Edited segment data were then
copied (as an additional channel) into the computer file containing the TMS
data (Appendix B).

Figure 2. Before Editing and After Editing Distributions for TMS Data
Corresponding to Barley Crop Type

VI. SUPERVISED SPECTRAL SIGNATURE DEVELOPMENT 

After preprocessing the TMS digital data and registering and editing the
segment data, the next step in the investigation was to develop supervised 
spectral signatures for each of the crop types present. Spectral signatures 
were developed through the use of software which uses a directional index 
table approach (MUCS - ref. 3). This software can be instructed to examine 
one channel of data (a mask file) and to develop spectral signatures from other 
channels of data (data file) for specified values in the mask file. The channel 
containing the SRS segment data was used as the mask file, and the software 
was instructed to develop a seven-channel spectral signature for the edited 
crop types. Since every crop type in the "SRS segment" channel had been 
assigned a unique value, spectral signatures were developed for each crop type 
individually. This technique was required because, in some instances, boundary 
pixels of fields within segments had to be eliminated. Where harvesting activities 
had modified the condition of the crop present, the harvested areas were used 
to develop supervised spectral signatures. These were added to those defining 
crop types of interest to the SRS. A total of 19 spectral signatures developed 
in this manner were stored in a computer disk file for later use in a quadratic 
maximum likelihood classifier (WMAX - ref. 3). 
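
The mask-driven signature development described above can be illustrated with a short Python sketch. It is not the MUCS software: the function name, the convention that mask code 0 means "no crop type assigned", and the two-channel toy pixels are all assumptions made for the example. The sketch computes only a mean vector and per-channel variances for each mask value, whereas a quadratic maximum likelihood classifier would also need the full covariance matrix for each signature.

```python
from collections import defaultdict


def class_signatures(pixels, mask):
    """Per-class mean vector and per-channel sample variance.

    pixels: list of per-pixel channel tuples (the data file).
    mask:   list of class codes, one per pixel (the mask file);
            code 0 is assumed here to mean "not in any field".
    """
    groups = defaultdict(list)
    for vector, code in zip(pixels, mask):
        if code != 0:
            groups[code].append(vector)

    signatures = {}
    for code, vectors in groups.items():
        n = len(vectors)
        channels = list(zip(*vectors))  # one tuple of values per channel
        means = [sum(ch) / n for ch in channels]
        variances = [
            sum((v - m) ** 2 for v in ch) / (n - 1) if n > 1 else 0.0
            for ch, m in zip(channels, means)
        ]
        signatures[code] = {"n": n, "mean": means, "variance": variances}
    return signatures


# Hypothetical two-channel pixels and mask codes (made-up values).
pixels = [(10, 20), (12, 22), (50, 60), (0, 0)]
mask = [1, 1, 2, 0]
signatures = class_signatures(pixels, mask)
print(signatures[1])
```

Because each crop type carries its own mask value, boundary pixels can be excluded simply by giving them a code that is not requested (code 0 in this sketch).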

VII. UNSUPERVISED APPROACH TO SPECTRAL SIGNATURE DEVELOPMENT

In addition to the supervised spectral signature development approach 
already mentioned, an unsupervised technique was examined. Fundamentally, 
unsupervised spectral signature development differs from supervised techniques 
in that unsupervised techniques "scan" the entire data set and, within limits 
established by the investigator, develop spectral signatures defining 
spectrally distinct features without prior knowledge of the land cover types 
which are contained within the data set. It then becomes a matter of relating 
the spectral signatures developed to actual land cover present, using aerial 
photographs, ground truth, a portion of the segment data for each crop cover 
type, and an image display device. The signatures developed were used to 
classify a portion of the ground truth set. Then, based on the manner in
which each signature classified the various crop types present, each was
assigned a "label": that of the crop type most frequently classified by that
signature. Once the spectral signature/land
cover relationships have been established, performance can be evaluated.
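
The labeling step described above amounts to a majority vote per signature. A minimal sketch follows, assuming that each ground truth pixel has already been assigned to one unsupervised signature; the function name and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def label_signatures(signature_ids, crop_types):
    """Label each signature with the crop type it classified most often."""
    votes = defaultdict(Counter)
    for signature, crop in zip(signature_ids, crop_types):
        votes[signature][crop] += 1
    return {
        signature: counts.most_common(1)[0][0]
        for signature, counts in votes.items()
    }


# Hypothetical classified ground truth pixels (made-up values).
signature_ids = [1, 1, 1, 2, 2, 3]
crop_types = ["barley", "barley", "corn", "potatoes", "potatoes", "corn"]
labels = label_signatures(signature_ids, crop_types)
print(labels)
```

Several signatures may receive the same label, which is how an unsupervised approach can represent one heterogeneous cover type with more than one spectral subclass.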

Of the various techniques for unsupervised spectral signature development 
found in the literature, point clustering was used for this study. Point 
clustering techniques (e.g., WCCL and PTCL; ref. 3) develop spectral signatures
by examining individual pixels of data, with the frequency of sampling
normally input by the user. As each point is examined, a decision is made
as to whether the new pixel is spectrally similar to points already examined.
If similar, it is grouped with the similar pixel(s). If not, it remains as a
separate spectral signature, and the next pixel in the data file is examined.
The process continues until all data have been processed.
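
The general idea of point clustering can be sketched as follows. This is an illustration of the procedure as described above, not the WCCL or PTCL software. The similarity test (Euclidean distance to a cluster mean, compared against a user threshold) and the running-mean update are assumptions, since the actual criteria are not given here; `step` stands in for the user-supplied sampling frequency.

```python
import math


def point_cluster(pixels, threshold, step=1):
    """Sequentially group sampled pixels into spectral clusters."""
    clusters = []  # each cluster: {"mean": [...], "n": pixel count}
    for vector in pixels[::step]:
        # Find the nearest existing cluster, if any.
        best, best_dist = None, None
        for cluster in clusters:
            dist = math.dist(vector, cluster["mean"])
            if best_dist is None or dist < best_dist:
                best, best_dist = cluster, dist
        if best is not None and best_dist <= threshold:
            # Similar: merge the pixel into that cluster (running mean).
            n = best["n"] + 1
            best["mean"] = [m + (v - m) / n for m, v in zip(best["mean"], vector)]
            best["n"] = n
        else:
            # Not similar to anything seen so far: start a new signature.
            clusters.append({"mean": [float(v) for v in vector], "n": 1})
    return clusters


# Hypothetical two-channel pixels (made-up values).
pixels = [(10, 10), (11, 9), (50, 52), (12, 11), (49, 50)]
clusters = point_cluster(pixels, threshold=5.0)
print(len(clusters), [c["n"] for c in clusters])
```

A tighter threshold produces more, narrower signatures; a looser one merges spectrally distinct features. This is the kind of parameter setting an investigator would vary.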

Various parameter settings of the unsupervised software were tried and the 
results of subsequent maximum likelihood classification were compared with the 
results obtained from the supervised approach outlined earlier in this document.

VIII. LAND COVER ESTIMATION


Acreage estimates were obtained using the USDA/EDITOR system. Classifi- 
cation results from the ERL system were edited into the "segment total file" 
created during direct expansion estimation. This file was then processed 
using the USDA system. 

Several problems were encountered in deriving estimates for the various 
land covers. Five of the 15 segments were located in Minnesota, so these 
segments were reassigned to Walsh County, North Dakota. A second problem was 
that nine segments were mini-segments, which should be handled differently
from the normal JES segment; however, splitting out these mini-segments would
not leave enough segments for estimation. To alleviate this problem, the 
ground data and classification acres for the mini-segments were multiplied 
by four so that a total of 15 equal size segments from Walsh County, stratum 
11, could be used in estimation. 
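
The size adjustment for mini-segments is simple arithmetic. The sketch below shows it under stated assumptions: the factor of four comes from the text, while the function name, record layout, and area values are hypothetical.

```python
MINI_SEGMENT_FACTOR = 4  # multiplier stated in the text


def equalize_segment(area_by_cover, is_mini):
    """Scale a mini-segment's areas so it is comparable to a regular segment."""
    factor = MINI_SEGMENT_FACTOR if is_mini else 1
    return {cover: area * factor for cover, area in area_by_cover.items()}


# Hypothetical ground data for one mini-segment (made-up values).
mini = {"Sp. Wht": 10.0, "Potatoes": 2.5}
scaled = equalize_segment(mini, is_mini=True)
print(scaled)
```

The same scaling would be applied to both the ground data and the classification results for a mini-segment, so that the two remain paired.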

Due to the problems discussed above and because of limited ground data 
after editing, the direct expansion and regression estimates computed from 
this data set are not statistically sound. Enough sample segments were 
available to compute the correlation between the ground truth and classification 
data for sugar beets, potatoes, spring wheat, and sunflowers. These R-square
(R²) values were used to compute the relative efficiency (RE) using the fol-
lowing formula:


The RE measures the improvement in terms of increased precision of the 
regression estimate, which combines both ground and Landsat data, over the 
direct expansion estimate, which utilizes only ground data (ref. 4). 
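
As a hedged illustration of how an R² value maps to an RE: for a regression estimator, the standard large-sample approximation is that the variance is (1 − R²) times the variance of the direct expansion estimate, which gives RE ≈ 1 / (1 − R²). The sketch below assumes this approximation; it is not necessarily the exact formula used in this study, which may include a small-sample correction. The function names are hypothetical.

```python
def relative_efficiency(r_squared):
    """Large-sample approximation: RE = 1 / (1 - R^2)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R-squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def implied_r_squared(re):
    """Invert the approximation: R^2 = 1 - 1 / RE."""
    if re < 1.0:
        raise ValueError("RE below 1 has no valid R-squared here")
    return 1.0 - 1.0 / re


print(relative_efficiency(0.5))
print(round(implied_r_squared(26.61), 3))
```

Under this assumed approximation, an RE of 26.61 would correspond to an R² of about 0.96, and an RE of 1.62 to an R² of about 0.38.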

IX. RESULTS AND DISCUSSION 

Table 1. Percent Correct Classification Values, North Dakota 7-Channel
TMS Data

     LAND COVER        NUMBER OF PIXELS    SUPERVISED    UNSUPERVISED
     CROP (SRS)        EVALUATED           (MUCS)        (WCCL)

 1.  Wasteland              190              57.78         73.34*
 2.  Sunflowers              91              47.37         84.72*
 3.  Spring Wheat           342              72.99         84.41*
 4.  Sugar Beets             56              69.85         75.98
 5.  Other Crops             46              87.90         82.86
 6.  Alfalfa                 45              85.93*        63.97
 7.  Barley                  38              91.67*        81.25
 8.  Potatoes               234              90.87*        68.23
 9.  Corn                    53              90.68*        89.53
10.  Beans                   50              69.57*        35.81
11.  Durum Wheat             50              91.53*        61.58
12.  Summer Fallow           61              97.10*        89.86
13.  Fall Fallow            268              90.46         86.93

     "Overall"                               80.97*        79.54

*Significantly better statistically than corresponding value for this
cover type

As shown in Table 1, the results obtained from the supervised approach
(based on 19 signatures) were significantly better in most cases than the
unsupervised approach (based on 59 signatures). Comparisons made were
accomplished through analysis of variance and Newman-Keuls analysis using
the arcsine square-root transformation. It is of interest to note that no matter
which of the techniques is used, ground truth polygons must be established 
to relate the spectral signatures to land cover types, and an independent 
set of data should be used to evaluate the performance of the final results 
produced. Such areas must be spectrally homogeneous in order to prevent the 
introduction of error into the analysis. Since this is the case, the work 
required to incorporate ground truth into the data analysis framework is 
the same for supervised and unsupervised approaches. In this respect, the 
SRS segments represent a convenient source of data useful for both spectral 
signature development/naming and evaluation of performance. Values that are 
significantly better statistically than the corresponding values of the 
alternative procedure for a given cover type are indicated with an asterisk 
in Table 1. For instance, point clustering produced higher accuracy values for 
wasteland, sunflowers, and spring wheat, and performed (statistically) as 
well for sugar beets, other crops, and fall fallow. 
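
Percent correct values are proportions, and their variance depends on the proportion itself. This is why a variance-stabilizing transformation is normally applied before analysis of variance. The sketch below assumes the transformation referred to above is the usual arcsine square-root transform; the function name is hypothetical, and the two input values are the potatoes entries from Table 1.

```python
import math


def arcsine_sqrt(percent_correct):
    """Arcsine square-root transform of a percentage, in radians."""
    p = percent_correct / 100.0
    if not 0.0 <= p <= 1.0:
        raise ValueError("percentage must be between 0 and 100")
    return math.asin(math.sqrt(p))


supervised = arcsine_sqrt(90.87)    # potatoes, supervised
unsupervised = arcsine_sqrt(68.23)  # potatoes, unsupervised
print(supervised, unsupervised)
```

The transformed values, rather than the raw percentages, would then be the inputs to the analysis of variance and the Newman-Keuls comparisons.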

In each case where a statistically significant difference exists in
favor of the unsupervised approach, an analysis of the frequency distribu-
tion of the raw TMS data corresponding to that crop type is enlightening.
For instance, wasteland as defined by the SRS contains all land within a
segment which is not dedicated to the production of agronomic crops (Figure
3). This would include such features as buildings, roads, ditches, trees,
water bodies, lawns, and highway medians. With such an amalgamation of
spectrally diverse land covers into one crop type, supervised spectral
signature development techniques (one signature for each crop type) cannot be
expected to perform very well. Unsupervised techniques, like point clustering,
were designed to work in this type of environment, since they were created to
develop spectral signatures which might better define subclasses within hetero-
geneous land cover types (such as "wasteland" as defined in the context of
this study). The sunflowers and spring wheat crops were nearing maturity and
were heterogeneous to some extent. This manifested itself in somewhat broad
distributions of raw data for several of the TMS channels. Supervised
statistics developed for such data would have large variances in numerous
channels, and performance based on such variance would not be expected to be
good.

Figure 3. Wasteland Data Distribution

In the other crop types, the TMS data were very "clean," with well defined 
distributions and relatively small variance (Figure 4). In these cases, the 
supervised approach did well, as these conditions are those assumed for 
supervised spectral signature development. This can be seen in the percent 
correct figures listed in Table 1. 

The relative efficiencies for the four crops are given in Table 2. The
REs range from 1.62 to 26.61, which indicates the estimates obtained by
incorporating the TMS data with ground truth show a significant reduction in
the variance. SRS considers any RE above 1.5 indicative of a significant
reduction in variance, and hence an improvement in the technique used to derive
land cover estimates.

Table 2. Relative Efficiencies for Four Crops

     CROP              RELATIVE EFFICIENCY

     Sugar Beets              1.62
     Potatoes                26.61
     Spring Wheat             4.37
     Sunflowers               2.09

X. CONCLUSIONS

Based on results of the TMS data analysis conducted in this study, the
following conclusions may be made:

1. Simulated Thematic Mapper digital data and a supervised spectral signature
development procedure performed to an overall level of 80.97% correct for
the 13 crop types of interest in the Walsh County, ND, study area.

2. Performance was significantly affected by the choice of the spectral
signature development technique for specific crop types. The
supervised technique was significantly better for seven cover
types, the unsupervised (point clustering) technique was sig-
nificantly better for three cover types, and there was no
significant difference between the two techniques for three
cover types. Overall (for all 13 cover types), the supervised
technique performed significantly better.

3. The selection of the signature development technique should be 
determined by the spectral homogeneity of the crop of interest, 
as measured by variance (SRS uses a different technique for 
developing signatures). 

4. Relative Efficiencies calculated for sugar beets, potatoes, spring
wheat, and sunflowers ranged from 1.62 to 26.61, indicating a
significant reduction in variance attributable to the use of TMS
data.

APPENDIX A

The TMS is a modified Texas Instruments RS-18 scanner with a 2.5-milliradian
Instantaneous Field of View mounted in a Gates-Learjet 23/24 aircraft.
Operational altitudes normally are at or near 12,000 m (39,370 ft) above
mean terrain elevation, resulting in a 30-m nadir spot size.

Bands 1-4 of the TMS consist of individual silicon detectors, and
receive incoming energy through a combination scanning mirror/modified
Cassegrainian telescope assembly. The incoming energy is passed through
a dichroic beam splitter which separates out the longer wavelength (infrared)
energy. After passing through a collimating lens assembly, the short
wavelength energy is directed onto individual fiber optics, located on the
focal plane of the final lens assembly, which transmit the energy to
individual detectors. Bands 5 and 6 utilize germanium and indium antimonide
detectors, respectively, to sample energy in the intermediate wavelength
(mid-IR) region, with optics similar to bands 1-4, although optimized for
energy of 1.3 µm to 3.0 µm wavelengths. Band 7 utilizes a restrictive filter
and a mercury-cadmium-telluride detector assembly to measure incident energy
in the thermal IR region.

APPENDIX B

ORIGINAL AND EDITED SRS SEGMENT CROP TYPE ACREAGES
FOR THE 15 SEGMENTS USED IN THIS STUDY

                           SRS*           EDITED
SEGMENT   CROP             GROUND TRUTH   GROUND TRUTH   COMMENTS

830       Boundary             6.48          10.26 ha
          Waste               14.58          14.58
          Sp. Wht             17.55           5.67
          Beets               16.20           0
          (Fall Fallow)                      24.30

816       Boundary             3.78          14.40       Reinterpreted
          Waste               23.40          25.56
          Sp. Wht              5.67           2.16
          Other               55.53          44.64
          (Fall Fallow)                       1.62

6599      Waste               17.46          49.23
          Sunflowers          10.53           9.27
          Sp. Wht            118.26          29.34
          Beets               27.90          25.83
          Barley              43.56          25.92
          Dur. Wht            27.99          15.93
          (Fall Fallow)                      90.18

819       Boundary             1.89          13.14       Segment Shifted
          Waste                3.33          12.42
          Sp. Wht             12.87           6.12
          Barley              25.20           0
          (Fall Fallow)                      11.61

818       Boundary             0             12.69
          Alfalfa             59.40          46.71

831       Boundary             0.18           6.03
          Waste                0.09           0.0
          Sum. Fallow         55.44          49.68

7543      Boundary            20.07          57.60       Reinterpreted
          Waste               14.22          15.84
          Sunflowers          43.29          15.57
          Sp. Wht             81.81           0
          Barley              72.81           0
          Dry Beans           24.48           0
          (Fall Fallow)                     167.67

821       Boundary             0              1.08       Reinterpreted
          Waste                0              2.07
          Sp. Wht             52.74          49.59

9167      Boundary            10.26          13.59       Segment Relocated
          Waste                7.56           7.56
          Sp. Wht             59.22          57.96
          Potatoes           128.97         128.97
          Corn                56.16          54.09

824/825   Boundary             4.59          10.80
          Sp. Wht             35.10           0
          Potatoes            20.34          18.99
          (Fall Fallow)                      20.24

7035      Boundary            16.20         123.75       Clouds & Shadows over some fields
          Waste                5.58           3.60
          Sp. Wht             57.87           0
          Beets               22.50          11.25
          Barley              37.44           0
          Potatoes            99.36          89.73
          Dry Beans            4.05           3.96
          (Fall Fallow)                      10.71

822       Boundary             1.44          10.89
          Waste                0.27           0.27
          Sp. Wht             56.61          13.86
          (Fall Fallow)                      33.30

7024      Boundary            29.88          76.95       Clouds
          Waste               31.23           3.51
          Sunflowers          22.05          17.28
          Sp. Wht            122.76          54.63
          Dry Beans           32.31          31.23
          Flax                15.21           0
          (Fall Fallow)                      69.84

178       Boundary            19.44          36.99       Segment relocated in data
          Waste               17.28          33.57
          Sunflowers          34.65          27.99
          Sp. Wht            115.92          94.41
          Beets               21.60          15.75

832       Boundary             0              6.93
          Sp. Wht             58.41          40.77
          (Fall Fallow)                      10.71

*Training pixels only.